Improved Spoken Query Transcription Using Co-Occurrence Information

نویسندگان

  • Jonathan Mamou
  • Abhinav Sethy
  • Bhuvana Ramabhadran
  • Ron Hoory
  • Paul Vozila
چکیده

Spoken queries are a natural medium for searching the Mobile Web. Language modeling for voice search recognition offers different challenges compared to more conventional speech applications. The challenges arise from the fact that spoken queries are usually a set of keywords and do not have a syntactic and grammatical structure. This paper describes a cooccurrence based approach to improve the accuracy of voice queries automatic transcription. With the right choice of scoring function and co-occurrence level, we show that co-occurrence information gives a 2% relative accuracy improvement over a state of the art system.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

New Developments in Spoken Query Transcription

The rapid growth of mobile devices with the ability to browse the Internet has opened up interesting application areas for speech and natural language processing technologies. Voice search is one such application where speech technology is making a big impact by enabling people to access the Internet conveniently from mobile devices. Spoken queries are a natural medium for searching the Mobile ...

متن کامل

Segmented Spoken Document Retrieval Using Word Co-occurrence Information

This paper shows several approaches for NTCIR-11 SpokenQuery&Doc [1]. This paper proposes several schemes to use word co-occurrence information for spoken document retrieval. Automatic transcriptions of spoken documents usually contain mis-recognized words, making the performance of spoken document retrieval signi cantly decrease. The cosine similarity to measure a document similarity must be i...

متن کامل

Speech-Based Retrieval Using Semantic Co-Occurrence Filtering

In this paper we demonstra te that speech recognition can be effectively applied to information retrieval (IR) applications. Our system exploits the fact that the intended words of a spoken query tend to co-occur in text documents in close proximity whereas word combinations that are the result of recognition errors are usually not semantically correlated and thus do not appear together. Termed...

متن کامل

Effects of Query Expansion for Spoken Document Passage Retrieval

One of the major challenges for spoken document retrieval is how to handle speech recognition errors within the target documents. Query expansion is promising for this challenge. In this paper, we apply relevance models, a type of query expansion method, for the spoken document passage retrieval task. We adapted the original relevance model for passage retrieval. We also extended it to benefit ...

متن کامل

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011